ORIGINAL ARTICLE

Genome-wide meta-analysis identifies six novel loci associated 
Coffee, a major dietary source of caffeine, is among the most widely consumed beverages in the world and has received
considerable attention regarding health risks and benefits. We conducted a genome-wide (GW) meta-analysis of predominately
regular-type coffee consumption (cups per day) among up to 91 462 coffee consumers of European ancestry with top
single-nucleotide polymorphisms (SNPs) followed up in ~30 062 and 7964 coffee consumers of European and African-American
ancestry, respectively. Studies from both stages were combined in a trans-ethnic meta-analysis. Confirmed loci were examined for
putative functional and biological relevance. Eight loci, including six novel loci, met GW significance (log10 Bayes factor (BF) > 5.64)
with per-allele effect sizes of 0.03-0.14 cups per day. Six are located in or near genes potentially involved in pharmacokinetics
(ABCG2, AHR, POR and CYP1A2) and pharmacodynamics (BDNF and SLC6A4) of caffeine. Two map to GCKR and MLXIPL genes related
to metabolic traits but lacking known roles in coffee consumption. Enhancer and promoter histone marks populate the regions of
many confirmed loci and several potential regulatory SNPs are highly correlated with the lead SNP of each. SNP alleles near GCKR,
MLXIPL, BDNF and CYP1A2 that were associated with higher coffee consumption have previously been associated with smoking
initiation, higher adiposity and fasting insulin and glucose but lower blood pressure and favorable lipid, inflammatory and liver
enzyme profiles (P < 5 × 10⁻⁸). Our genetic findings among European and African-American adults reinforce the role of caffeine
in mediating habitual coffee consumption and may point to molecular mechanisms underlying inter-individual variability in
pharmacological and health effects of coffee.

Molecular Psychiatry advance online publication, 7 October 2014; doi:10.1038/mp.2014.107 



INTRODUCTION 

Coffee is among the most widely consumed beverages in the
world. North American coffee drinkers typically consume ~2 cups
per day while the norm is at least 4 cups in many European
countries. In prospective cohort studies, coffee consumption is
consistently associated with lower risk of Parkinson's disease, liver
disease and type 2 diabetes. However, the effects of coffee on
cancer development, cardiovascular and birth outcomes and other
health conditions remain controversial. For most populations,
coffee is the primary source of caffeine, a stimulant also present in
other beverages, foods and medications. The fifth edition of the
Diagnostic and Statistical Manual of Mental Disorders does not
include a diagnosis of caffeine dependence or abuse due to a
paucity of evidence but lists caffeine intoxication and withdrawal
as disorders. Knowledge of factors contributing to coffee's
consumption and physiological effects may greatly advance the
design and interpretation of population and clinical research on
coffee and caffeine. Genetic factors could be especially valuable
as they offer ways to study the potential health effects of coffee
via instrumental variables or gene-environment interactions.
Heritability estimates for coffee and caffeine use range between
36 and 58%. Genome-wide association studies (GWAS) of
habitual caffeine and coffee intake have identified variants near
CYP1A2 and aryl hydrocarbon receptor (AHR). Cytochrome P450
(CYP)1A2 is responsible for ~95% of caffeine metabolism in
humans and AHR has a regulatory role in basal and substrate-induced
expression of target genes, including CYP1A1 and CYP1A2.

To identify additional loci, we conducted a staged genome- 
wide (GW) meta-analysis of coffee consumption including over 
120 000 coffee consumers sourced from population-based studies 
of European and African-American ancestry. 
MATERIALS AND METHODS

Study design and populations 

Supplementary Figure S1 depicts an overview of the current study. We
performed a meta-analysis of GWAS summary statistics from 28
population-based studies of European ancestry to detect single-nucleotide
polymorphisms (SNPs) that are associated with coffee consumption. Top loci
were followed up in studies of European (13 studies) and African-American
(7 studies) ancestry and confirmed loci were explored in a single Pakistani
population. Detailed information on study design, participant characteristics,
genotyping and imputation for all contributing studies is provided in the
Supplementary Information and Supplementary Tables S1-S6.

Phenotype 

All phenotype data were previously collected via interviewer- or
self-administered questionnaires (Supplementary Table S1). Our primary
phenotype ('phenotype 1') was cups of predominately regular-type coffee
consumed per day among coffee consumers. Coffee data collected
categorically (for example, 2-3 cups per day) were converted to cups
per day by taking the median value of each category (for example, 2.5 cups
per day). A secondary analysis was performed comparing high with
infrequent/non-coffee consumers ('phenotype 2'). A subset of stage 1
studies collected information on decaffeinated coffee consumption, which
was examined in follow-up analysis of the confirmed loci.
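
As a minimal sketch of the category conversion described above, the function below maps a categorical response to a single cups-per-day value. The function name, the label format and the handling of single-value or open-ended categories (which simply return the stated number) are assumptions for illustration and are not taken from the study protocol.

```python
import re


def category_to_cups(label: str) -> float:
    """Convert a categorical intake label (e.g. '2-3 cups per day') to cups per day.

    For a two-sided range the median of the category equals its midpoint.
    Labels with a single number (e.g. '1 cup per day', '6+') return that
    number unchanged, which is an illustrative assumption.
    """
    numbers = [float(x) for x in re.findall(r"\d+(?:\.\d+)?", label)]
    if not numbers:
        raise ValueError(f"no numeric value found in {label!r}")
    if len(numbers) == 1:
        return numbers[0]
    low, high = numbers[0], numbers[1]
    return (low + high) / 2.0


if __name__ == "__main__":
    for response in ["2-3 cups per day", "4-5 cups per day", "1 cup per day"]:
        print(response, "->", category_to_cups(response))
```

Collapsing each category to one value keeps the trait on a cups-per-day scale for linear regression, at the cost of some precision within categories.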

Statistical analysis 

Each stage 1 (discovery) study performed GWA testing for each phenotype
across ~2.5 million genotyped or imputed autosomal SNPs (HapMap II,
Centre d'Etude du Polymorphisme Humain (CEU) reference), based on
linear (cups per day, phenotype 1) or logistic (high vs none/low, phenotype
2) regression under an additive genetic model. Analyses were adjusted for
age, smoking status and, when applicable, sex, case-control status, study
site, family structure and/or study-specific principal components of
population substructure (Supplementary Table S7). SNPs with minor allele
frequency < 0.02 or with low imputation quality scores were removed
before meta-analysis (Supplementary Table S5). The GWAtoolbox (see
Supplementary Information for URLs) was used for initial quality control.
Minor allele frequencies and a plot comparing (1/median standard error of
effect size) vs (square root of sample size) for each study were also reviewed
for outliers and these were addressed before the final meta-analysis.

For both phenotypes, GW meta-analysis was conducted using a
fixed-effects model and inverse-variance weighting with a single genomic
control correction as implemented in METAL and GWAMA (r > 0.99 for
correlation between METAL and GWAMA results). The phenotypic variance
explained by additive SNP effects was estimated in the Women's Genome
Health Study (WGHS, n = 15 987 with identity-by-state < 0.025) using
GCTA. Stage 1 summary statistics were also subjected to pathway
analysis using MAGENTA (Supplementary Information).
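
The fixed-effects, inverse-variance-weighted combination named above can be sketched for a single SNP as follows. This is an illustrative reimplementation of the textbook formula, not the code of METAL or GWAMA; the function names are hypothetical, and leaving standard errors untouched when the inflation factor is below 1 is an assumed convention.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance-weighted fixed-effects estimate for one SNP.

    betas: per-study effect sizes (cups per day per allele)
    ses:   per-study standard errors
    Returns (pooled beta, pooled standard error, two-sided P-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided normal P-value, 2 * (1 - Phi(|z|)), written with erfc for accuracy.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


def genomic_control_se(se, lam):
    """Inflate a standard error by sqrt(lambda); values of lambda below 1 are ignored."""
    return se * math.sqrt(max(lam, 1.0))


if __name__ == "__main__":
    # Hypothetical per-study estimates for one SNP.
    print(fixed_effects_meta([0.10, 0.14, 0.08], [0.03, 0.05, 0.04]))
```

Studies with smaller standard errors receive larger weights, so the pooled estimate is dominated by the largest and most precise cohorts.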

For regions achieving association P-values < 5 × 10⁻⁸ (7p21, 7q11.23,
11p13 and 15q24), we performed conditional analysis using the summary
statistics from the meta-analysis to test for the association of each SNP
while conditioning on the top SNPs, with correlations between SNPs due
to linkage disequilibrium (LD) estimated from the imputed genotype data
from the Atherosclerosis Risk in Communities cohort, a large and
representative cohort of men and women of European ancestry.

Our approach to select SNPs for replication (stage 2) is described in 
Supplementary Information. Stage 2 meta-analyses were performed 
separately for European and African-American populations, using the 
same statistical models and methods as described for stage 1, but without 
genomic control (Supplementary Information). 

Studies from all stages were included in an overall meta-analysis using
MANTRA (Meta-ANalysis of TRans-ethnic Association studies), which
adopts a Bayesian framework to combine results from different ethnic
groups by taking advantage of the expected similarity in allelic effects
between the most closely related populations. MANTRA was limited to
SNPs selected for replication; thus, no genomic control was applied. A
random-effects analysis using GWAMA was performed in parallel to obtain
effect estimates, which are not generated by MANTRA. The
GW-significance threshold of log10 BF > 5.64 approximates a traditional GW
P-value threshold of 5 × 10⁻⁸ under general assumptions. Subgroup
analysis and meta-regression were performed to investigate possible
sources of between-study heterogeneity (Supplementary Information).
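
To illustrate what a random-effects analysis adds when study results are heterogeneous, the sketch below uses the DerSimonian-Laird estimator of between-study variance. Whether this matches the estimator inside GWAMA is an assumption; the function name and the example values are hypothetical.

```python
import math


def random_effects_meta(betas, ses):
    """DerSimonian-Laird random-effects meta-analysis for one SNP.

    Returns (pooled beta, pooled standard error, Cochran's Q, tau^2).
    """
    w = [1.0 / se ** 2 for se in ses]
    sw = sum(w)
    beta_fe = sum(wi * b for wi, b in zip(w, betas)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effects estimate.
    q = sum(wi * (b - beta_fe) ** 2 for wi, b in zip(w, betas))
    k = len(betas)
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    swr = sum(w_re)
    beta_re = sum(wi * b for wi, b in zip(w_re, betas)) / swr
    se_re = math.sqrt(1.0 / swr)
    return beta_re, se_re, q, tau2


if __name__ == "__main__":
    # Two hypothetical studies with discordant estimates.
    print(random_effects_meta([0.0, 0.2], [0.02, 0.02]))
```

When the studies agree, tau^2 is zero and the result reduces to the fixed-effects estimate; when they disagree, the pooled standard error widens.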

© 2014 Macmillan Publishers Limited 



Fine-mapping. To assess the improvement in fine-mapping resolution
due to trans-ethnic meta-analysis, we applied the methods of Franceschini
et al. to stage 1 and stage 2 (African Americans only) GW-summary level
data (Supplementary Information).

Potential SNP function and biological and clinical inferences 
Details pertaining to follow-up of confirmed loci are provided in the 
Supplementary Information. Briefly, all confirmed index SNPs and their 
correlated proxies were examined for putative function using publicly 
available resources. Bioinformatics and computational tools were used to 
systematically mine available knowledge and experimental databases to 
inform biological hypotheses underlying the link between loci and coffee 
consumption as well as connections between loci. For these analyses all 
genes mapping to the confirmed regions were considered as potential 
candidates. Finally, we searched the National Human Genome Research
Institute GWAS catalog and Metabolomics GWAS server for all GW-significant
associations with our confirmed coffee SNPs. Complete GWAS summary
data for coffee-implicated diseases or traits were additionally queried.

RESULTS 

SNPs associated with coffee consumption 

Discovery stage. Results from the discovery stage are summarized
in Supplementary Figures S2-S5. Little evidence for genomic inflation
(λ < 1.07) was observed for either phenotype. The two analyses
yielded similarly ranked loci and significant enrichment of 'xenobiotic'
genes (MAGENTA's FDR < 0.006), suggesting no major difference in
the genetic influence on coffee drinking initiation compared with the
level of coffee consumption among coffee consumers at these loci.
Overall, ~7.1% (standard error: 2%) of the variance in coffee cups
consumed per day (phenotype 1) could be explained by additive and
common SNP effects in the WGHS.
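
The genomic inflation factor λ quoted above is conventionally the median observed 1-df chi-square statistic divided by its expected median under the null. A minimal sketch, assuming the input is a list of two-sided association P-values strictly between 0 and 1 (the function name is hypothetical):

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364231


def genomic_inflation(p_values):
    """Estimate lambda from two-sided P-values (each must satisfy 0 < p <= 1)."""
    nd = NormalDist()
    # A two-sided P-value p corresponds to chi2 = z^2 with z = Phi^{-1}(p / 2).
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    n = 1001
    uniform_p = [(i + 0.5) / n for i in range(n)]  # idealised null distribution
    print(genomic_inflation(uniform_p))
```

Values close to 1 indicate that the bulk of test statistics behaves as expected under the null, which is the sense in which λ < 1.07 is described as little evidence of inflation.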

Conditioning on the index SNPs of each region achieving
association P-values < 5 × 10⁻⁸ (7p21, 7q11.23, 11p13 and 15q24)
in the discovery stage provided little evidence for multiple
independent variants (Supplementary Figure S6). Only four of
the SNPs on chromosome 7 were potentially independent and
carried forward with other promising SNPs.

Replication and trans-ethnic meta-analysis. Forty-four SNPs spanning
thirty-three genomic regions met significance criteria for
candidate associations and were followed up in stage 2
(Supplementary Tables S8-S13). Eight loci, including six novel,
met our criteria for GW significance (log10 BF > 5.64) in a
trans-ethnic meta-analysis of all discovery and replication studies
(Table 1; Supplementary Tables S14-S16; Supplementary Figures
S7 and S8). Confirmed loci have effect sizes of 0.03-0.14 cups
per day per allele and together explain ~1.3% of the phenotypic
variance of coffee intake. We were underpowered to replicate these
associations in a Pakistani population (Supplementary Information).
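
As a rough guide to how per-allele effects of this size translate into variance explained, the sketch below applies the standard additive-model formula 2f(1 - f)β²/Var(Y). The allele frequencies, effect sizes and trait standard deviation in the example are hypothetical placeholders, not values from the study, and summing across loci assumes they are independent.

```python
def variance_explained(beta, freq, trait_sd):
    """Fraction of trait variance explained by one SNP under an additive model.

    beta:     per-allele effect (trait units, here cups per day)
    freq:     effect-allele frequency (Hardy-Weinberg equilibrium assumed)
    trait_sd: phenotypic standard deviation in the same units as beta
    """
    return 2.0 * freq * (1.0 - freq) * beta ** 2 / trait_sd ** 2


if __name__ == "__main__":
    # Hypothetical (beta, allele frequency) pairs for illustration only.
    loci = [(0.14, 0.60), (0.12, 0.25), (0.05, 0.40), (0.03, 0.20)]
    trait_sd = 1.7  # hypothetical standard deviation of cups per day
    total = sum(variance_explained(b, f, trait_sd) for b, f in loci)
    print(f"{100 * total:.2f}% of variance")
```

Because the contribution scales with β², loci with effects of a few hundredths of a cup per day add very little even when the allele is common.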

Functional and biological inferences 

Enhancer (H3K4me1) and promoter (H3K4me3) histone marks
densely populate many of these regions and several
non-synonymous and potential regulatory SNPs are highly correlated
(r² > 0.8) with the lead SNP and are thus strong candidates for being a
causal variant (Table 2; Supplementary Information; Supplementary
Tables S17-S19). Candidate genes form a highly connected network
of interactions, featuring discernible clusters of genes around
brain-derived neurotrophic factor (BDNF) and AHR (Figure 1;
Supplementary Information; Supplementary Tables S20 and S21).
At least one gene in each of the eight regions (i) is highly expressed
in brain, liver and/or taste buds, (ii) results in phenotype
abnormalities relevant to coffee consumption behavior when modified in
mice and (iii) is differentially expressed in human hepatocytes when
treated with high (7500 μM) but not low (1500 μM) doses of caffeine
(Table 2; Supplementary Tables S22-S24).

Molecular Psychiatry (2014), 1-10 



GWAS of coffee consumption 
The Coffee and Caffeine Genetics Consortium et al 






Additional genomic characterization of the top loci allows 
further biological inference as follows: 

(i) Previously identified loci near AHR (7p21) and CYP1A2 (15q24).
Consistent with previous reports in smaller samples, the
intergenic 7p21 and 15q24 loci near AHR and CYP1A1/CYP1A2
respectively remained the most prominent and highly
heterogeneous loci associated with coffee consumption. The same
index SNPs were identified in European and African Americans,
suggesting that they are robust HapMap proxies for causal
variants in these two populations. Cohort-wide mean coffee
consumption explained part of the heterogeneity in study results
for both loci (Supplementary Table S25; Supplementary
Information). The rs2472297 T and rs4410790 C alleles associated
with increased coffee consumption have recently been associated
with lower plasma caffeine levels and shown to increase
CYP1A2-mediated metabolism of olanzapine. The C allele of
rs4410790 is also positively correlated with cerebellum AHR
methylation, suggesting a novel role of Ahr in motor or learning
pathways that may trigger coffee consumption. The most
significant variants at 15q24 reside in the CYP1A1-CYP1A2
bidirectional promoter where AHR response elements have been
identified and shown to be important for transcriptional activation
of both CYP1A1 and CYP1A2. The rs2472297 T variant putatively
weakens the binding of SP1, a co-activator in the Ahr-Arnt
complex regulating CYP1 locus transcription, and is also
implicated in the expression of several neighboring genes. The
latter observation, together with this region's high LD and long
range chromatin interactions (Supplementary Figure S9), suggests
a regulatory network among these genes.

(ii) Novel loci at 7q11.23 (POR) and 4q22 (ABCG2) likely function in
caffeine metabolism. Variants at 7q11.23 (rs17685) and 4q22
(rs1481012) map to novel yet biologically plausible candidate
genes involved in xenobiotic metabolism. rs17685 maps to the
3'UTR of POR, encoding P450 oxidoreductase which transfers
electrons to all microsomal CYP450 enzymes. The rs17685 A
variant associated with higher coffee consumption is linked to
increased POR expression and potentially weakens the DNA
binding of several transcriptional regulatory proteins including
BHLHE40, which inhibits POR expression. The same SNP is in LD
(CEU: r² = 0.93) with POR*28 (rs1057868, Ala503Val), which is
associated with differential CYP activity depending on the CYP
isoform, substrate and experimental model used. rs1481012 at
4q22 maps to ABCG2, encoding a xenobiotic efflux transporter.
rs1481012 is in LD (CEU: r² = 0.92) with rs2231142 (Gln141Lys), a
functional variant at an evolutionarily constrained residue.
However, fine-mapping of this region on the basis of reduced
LD in the African-American sample limited an initial 189102-kb
region to a credible span of 6249 kb (Supplementary Table S16)
that excluded rs2231142.

(iii) Novel loci at 11p13 (BDNF) and 17q11.2 (SLC6A4) likely mediate
the positive reinforcing properties of coffee constituents. The index
SNP at 11p13 is the widely investigated missense mutation
(rs6265, Val66Met) in BDNF (Supplementary Table S26). BDNF
modulates the activity of serotonin, dopamine and glutamate,
neurotransmitters that are involved in mood-related circuits and have a
key role in memory and learning. The Met66 allele impairs
neuronal activity-dependent BDNF secretion and thus may
attenuate the rewarding effects of coffee and, in turn, motivation
to consume coffee. The increasingly recognized roles of BDNF in
the chemosensory system and conditioned taste preferences may
also be relevant. The index SNP (rs9902453) at 17q11.2 maps to
the EFCAB5 gene and is in LD (CEU: r² > 0.8) with SNPs that alter
regulatory motifs for AhR in the neighboring gene NSRP1, but
neither gene is an obvious candidate for coffee consumption.
Upstream of rs9902453 lies a possibly stronger candidate: SLC6A4,
encoding the serotonin transporter. Serotonergic neurotransmission
affects a wide range of behaviors including sensory
processing and food intake.

Potential function of loci associated with coffee consumption

Abbreviations: CEU, Centre d'Etude du Polymorphisme Humain; CR, conserved region; eQTL, expression quantitative trait loci; LD, linkage disequilibrium;
mQTL, methylation quantitative trait loci; SNP, single-nucleotide polymorphism. See Supplementary Information for details and references to data resources.
In vitro human hepatic gene expression in response to caffeine. Red and green font correspond to increased and decreased expression, respectively.
Lead SNP allele associated with higher coffee consumption.
Check marks (✓) denote the presence of non-synonymous SNPs in LD (CEU: r² > 0.8) with lead SNP (details provided for lead SNP only).
Check marks (✓) denote the presence of a conserved region (spanning lead SNP and its correlated proxies, CEU: r² > 0.8).
Check marks (✓) denote the presence of DNAse hypersensitivity sites at region spanning lead SNP and its correlated proxies, CEU: r² > 0.8.
Check marks (✓) denote the presence of proteins bound at region spanning lead SNP and its correlated proxies, CEU: r² > 0.8.
Enhancer (H3K4me1) or promoter (H3K4me3) histone marks (as defined by Ernst et al.) spanning lead SNP and its correlated proxies, CEU: r² > 0.8.
Regulatory motifs altered by lead SNP.
Expression QTLs for lead SNP or perfect proxy (CEU: r² = 1) derived from lymphoblastoid cell lines, blood, or liver, adipose and brain tissues. Red and green font correspond to
increased and decreased expression, respectively, relative to allele associated with higher coffee consumption. Direction of GITI expression is not available.
Methylation QTLs for lead SNP derived from cerebellum and frontal cortex. Red and green font correspond to increased and decreased expression,
respectively, relative to allele associated with higher coffee consumption.

(iv) Novel loci at 2p24 (GCKR) and 7q11.23 (MLXIPL). Variants at
2p24 (rs1260326) and 7q11.23 (rs7800944) map to GCKR and
MLXIPL, respectively. The former has been associated with plasma
glucose and multiple metabolic traits and the latter with plasma
triglycerides (Table 3; Supplementary Table S27). Adjustment of
regression models for plasma lipids in the WGHS (n ~ 17 000) and
plasma glucose in TwinGene (n ~ 8800) did not significantly
change the relationship between SNPs at these two loci and
coffee consumption (P > 0.48, Supplementary Tables S28 and S29).
The rs1260326 T allele encodes a non-synonymous change in the
encoded glucokinase regulatory protein leading to increased
hepatic glucokinase activity. Glucokinase regulatory protein and
glucokinase may also cooperatively function in the
glucose-sensing process of the brain that may, in turn, influence central
pathways responding to coffee constituents. A direct link between
MLXIPL and coffee consumption remains unclear, except for the
interactions with other candidate genes (Figure 1). Experimental
evidence and results from formal prioritization analyses also
warrant consideration of other candidates in these regions
(Figure 1; Table 2; Supplementary Table S23). For example, in the
frontal cortex, the rs1260326 allele positively associated with
coffee consumption correlates with lower methylation of PPM1G, a
putative regulatory target for AhR and binding target for PPP1R1B,
which mediates psychostimulant effects of caffeine.

Pleiotropy and clinical inferences 

None of the eight loci was significantly associated with caffeine
taste intensity (P > 0.02) or caffeine-induced insomnia (P > 0.08),
according to previously published GWAS of these traits. SNPs
near AHR associated with higher coffee consumption were also
significantly associated with higher decaffeinated coffee
consumption (~0.05 cups per day, P < 0.0004, n = 24 426), perhaps a
result of Pavlovian conditioning among individuals moderating
their intake of regular coffee or of the small amounts of caffeine in
decaffeinated coffee.

Across phenotypes in the GWAS catalog, the alleles leading to
higher coffee consumption at 2p24, 4q22, 7q11.23, 11p13 and
15q24 have been associated with one or more of the following:
smoking initiation, higher adiposity and fasting insulin and
glucose but lower blood pressure and favorable lipid,
inflammatory and liver enzyme profiles (P < 5 × 10⁻⁸; Table 3;
Supplementary Table S27). Focusing on metabolic, neurologic and
psychiatric traits for which coffee has been implicated (Table 3;
Supplementary Table S32), we found additional sub-GW
significant associations in published GWAS. Variants associated
with higher coffee consumption increased adiposity (rs1481012, 
P = 4.85x10"^), birth weight (rs7800944, P = 2.10x 10"^), plasma 
high-density lipoprotein (HDL, rs7800944, P = 2.24x10"\ risk of 
Parkinson's disease (rs1481012, P = 7.11x10~\ reduced blood 
pressure (rs6265, P = 6.58x 10""^; rs2472297, P< 6.80x10"^ and 
rs9902453, P = 6.05 x 10"^), HDL (rs6968554, P= 1.18x 10"^), risk 
of major depressive disorder (rs17685, P = 6.98 x 10~^) and bipolar
disorder (rs1260326, P = 2.31 x10"^). Associations with adiposity, 
birth weight, blood pressure, HDL and bipolar disorder remain 
significant after correcting for the number of SNPs tested. 
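
A correction for the number of SNPs tested is commonly a Bonferroni adjustment. Assuming that approach with a family-wise α of 0.05 and the eight confirmed index SNPs, the per-test threshold is 0.05/8 = 6.25 × 10⁻³. The sketch below shows the arithmetic; the function names are hypothetical and the P-values in the example are made up for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes_correction(p_values, alpha=0.05):
    """Return a list of booleans: True where a P-value survives the correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]


if __name__ == "__main__":
    print(bonferroni_threshold(0.05, 8))
    # Eight hypothetical P-values, one per index SNP.
    print(passes_correction([0.0004, 0.02, 0.005, 0.3, 0.0071, 0.0009, 0.048, 0.6]))
```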

DISCUSSION 

Coffee's widespread popularity and availability have fostered public
health concerns about the potential health consequences of regular
coffee consumption. Findings from epidemiological studies of
coffee consumption and certain health conditions remain
controversial. Knowledge of genetic factors contributing to coffee's
consumption and physiological effects may inform the design and
interpretation of population and clinical research on coffee. In the
current report, we present results of the largest GWAS of coffee
intake to date and the first to include populations of
African-American ancestry. In addition to confirming associations with
AHR and CYP1A2, we have identified six new loci not previously
implicated in coffee drinking behavior.

colored according to locus. Candidate genes for loci identified in the current study were supplemented with known candidate genes related
to caffeine pharmacology (gray nodes). Edges indicate known interactions.

Our findings highlight an important role of the pharmacokinetic
and pharmacodynamic properties of the caffeine component of
coffee underlying a genetic propensity to consume the beverage.
Loci near BDNF and SLC6A4 potentially impact consumption
behavior by modulating the acute behavioral and reinforcing
properties of caffeine. Others near AHR, CYP1A2, POR and ABCG2
act indirectly by altering the metabolism of caffeine and thus the
physiological levels of this stimulant. The strength of these four
associations with coffee intake, along with results from pathway
analysis showing significant enrichment for 'xenobiotic' genes,
emphasizes an especially pronounced role of caffeine metabolism
in coffee drinking behavior. The current study is the first to
link GCKR and MLXIPL variation to a behavioral trait. The
non-synonymous rs1260326 SNP in GCKR has been a GW signal for
various metabolic traits, particularly those reflecting glucose
homeostasis (Table 3). GCKR variation may impact the
glucose-sensing process of the brain that may, in turn, influence central
pathways responding to coffee constituents. Methylation
quantitative trait loci and binding motif analysis suggest that PPM1G
may be another candidate underlying the association between
rs1260326 and coffee consumption. Variants near MLXIPL have
also topped the list of variants associated with plasma
triglycerides (Table 3), but their link to coffee consumption remains
unclear. Future studies on the potential pleiotropic effects of these
two loci are clearly warranted. Interestingly, several candidate
genes implicated in coffee consumption behavior, but not
confirmed in our GWAS, interact with one or more of the eight
confirmed loci (Figure 1). While these findings are encouraging for
ongoing efforts, they also emphasize the need to study sets or
pathways of genes in the future.

Specific SNPs associated with higher coffee consumption have
previously been associated with smoking initiation, higher
adiposity and fasting insulin and glucose but lower blood pressure
and favorable lipid, inflammatory and liver enzyme profiles.
Whether these relationships reflect pleiotropy, confounding or
offer insight to the potential causal role coffee plays in these traits
merits further investigation. Future research, particularly
Mendelian Randomization and gene-coffee interaction studies, will need
to consider the direct and indirect roles that each SNP has in
altering coffee drinking behavior as well as the potential for
interactions between loci (Figure 1). The heterogeneous effects
specific to AHR- and CYP1A2-coffee associations point to
SNP-specific interactions with the environment or population
characteristics that might also warrant consideration (Supplementary
Information).

Abbreviations: CEU, Centre d'Etude du Polymorphisme Humain; DBP, diastolic blood pressure; HDL, high-density lipoprotein; LD, linkage disequilibrium;
LDL, low-density lipoprotein; SBP, systolic blood pressure; SHBG, sex hormone binding globulin; SNP, single-nucleotide polymorphism.
Lead SNP allele associated with
associations (P < 6.25 × 10⁻³). See Supplementary Information for details

The strong cultural influences on norms of coffee drinking may
have reduced our power for loci discovery. This might, in part,
underlie our lack of replication in a Pakistani population, wherein
coffee consumption is extremely rare. Methodological limitations
specific to our approach may also have reduced our power for loci
discovery or precision in estimating effect sizes (Supplementary
Information). For example, some studies collected coffee data in
categories of cups per day (for example, 2-3 cups per day),
rendering a less precise record of intake as well as a non-Gaussian
distributed trait for analysis. The precise chemical composition of
different coffee preparations is also not captured by standard food
frequency questionnaires and is likely to vary within and between
populations. Nevertheless, the eight loci together explain ~1.3%
of the phenotypic variance, a value at least as great as that
reported for smoking behavior and alcohol consumption, which
are subject to similar limitations in GWAS.

The additive genetic variance (or narrow-sense heritability) of
coffee intake as estimated by GCTA in WGHS (7%) is considerably
lower than estimates based on pedigrees (36-57%). The marked
discrepancies between the GCTA and pedigree estimates of
heritability may be due to one or more of the following: the
potential contribution of rare variants to heritability (not captured
by GCTA's 'chip-based heritability'), biases in pedigree analysis
resulting in overestimates of heritability, differences in phenotype
ascertainment or definition and cultural differences in the
populations studied.

In conclusion, our results support the hypothesis that metabolic 
and neurological mechanisms of caffeine contribute to coffee 
consumption habits. Individuals adapt their coffee consumption 
habits to balance perceived negative and reinforcing symptoms 
that are affected by genetic variation. Genetic control of this 
potential 'titrating' behavior would incidentally govern exposure 
to other potentially 'bioactive' constituents of coffee that may be 
related to the health effects of coffee or other sources of caffeine. 
Thus, our findings may point to molecular mechanisms underlying 
inter-individual variability in pharmacological and health effects of 
coffee and caffeine. 